In recent years, cyanobacteria have emerged as attractive
microbial hosts for hydrocarbon production. [1,2] However, whilst
appealing, the practicalities of producing biofuels in cyanobac- 
teria remain challenging, requiring the identification and engi- 
neering of natural biocatalysts for alkane production and their 
integration into metabolic processes. [3] Cyanobacterial hydro- 
carbon biosynthesis is presumed to arise from fatty acid catab- 
olism involving two main enzymatic reactions, an acyl carrier 
protein reductase (AAR; converting fatty acids to aldehydes) 
followed by loss of the carbonyl group to form alka(e)nes cata- 
lysed by an aldehyde-deformylating oxygenase (ADO), known 
formerly as aldehyde decarbonylase (AD; Scheme 1). [4-7]
Cyanobacterial ADOs (cADOs) have been isolated and shown to
decarbonylate long-chain aldehydes (C18 and above), which are
the presumed physiological substrates, whilst also displaying
in vitro activity with medium-chain aldehydes (e.g.,
heptanal). [8,9]

Scheme 1. Cyanobacterial hydrocarbon production. AAR: acyl carrier protein
reductase; ACP: acyl carrier protein; ADO: aldehyde-deformylating
oxygenase; HCO2⁻: formate.

There is intense interest in engineering cyanobacteria to ac- 
cumulate short-chain alkanes and produce "drop in" fuels such 
as propane. This requires enzyme catalysts that can produce 
these hydrocarbons from precursors derived from central me- 
tabolism. Here, we set out to engineer improved variants of 
Prochlorococcus marinus (strain MIT9313) cADO through
structure-based engineering of the substrate-access tunnel. Our
intention was to alter the natural specificity of cADO to favour 
reactivity against short-chain over long-chain aldehydes, and 
thereby produce new catalytic modules for metabolic engi- 
neering. 

The availability of a crystal structure for P. marinus cADO 
(strain MIT9313; Joint Center for Structural Genomics; PDB ID: 
2OC5) enabled us to investigate the active site architecture,
and identify residues that may influence substrate binding. [10] 
The structure reveals a mainly alpha helical architecture, with 
a ferritin-like four-helix bundle. The latter contains the di-iron 
centre, coordinated by two histidine residues and four carbox- 
ylates from glutamate side chains. Substrates access the active 
site through a tunnel-like hydrophobic pocket, as evidenced 
by the occupancy of an unknown ligand of extended chain 
length in the 2OC5 crystal structure. One end of the unknown
ligand is located close to the di-iron centre within the sub- 
strate-binding tunnel, where iron-catalysed decarbonylation 
occurs. The mechanism of the unusual iron-catalysed decar- 
bonylation reaction has been studied recently. Two distinct 
mechanisms (oxygen dependent or hydrolytic) have been pro- 
posed. [11,12] It was originally thought that decarbonylation pro- 
ceeds hydrolytically in the absence of oxygen, [13] but more 
recent evidence suggests turnover requires it. [14,15] Despite 
some progress on the mechanistic understanding of cADO it 
remains unclear why turnover with medium/long-chain
aldehydes is slow (typically ~3-5 turnovers per hour). With
heptanal as substrate, an exponential burst phase with a k_app value
of 0.27 ± 0.03 s⁻¹ has been reported. [16] However, k_cat
measurements under steady-state turnover conditions returned a value
of ~1 min⁻¹.
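
The rate constants quoted above are given in three different time units (per second, per minute and per hour), which makes them awkward to compare at a glance. The following minimal sketch simply converts them to a common unit of min⁻¹; the function and variable names are illustrative and do not come from the cited studies.

```python
# Convert the reported rate constants to a common unit (min^-1).
# All names here are illustrative; only the numerical values come from the text.
SECONDS_PER_MINUTE = 60.0
MINUTES_PER_HOUR = 60.0


def per_second_to_per_minute(rate_per_second):
    """Convert a first-order rate constant from s^-1 to min^-1."""
    return rate_per_second * SECONDS_PER_MINUTE


def per_hour_to_per_minute(rate_per_hour):
    """Convert a turnover frequency from h^-1 to min^-1."""
    return rate_per_hour / MINUTES_PER_HOUR


# Burst-phase k_app with heptanal (0.27 s^-1).
burst_per_min = per_second_to_per_minute(0.27)

# "Typical" slow turnover of 3-5 turnovers per hour.
slow_range_per_min = (per_hour_to_per_minute(3.0), per_hour_to_per_minute(5.0))

# Steady-state k_cat with heptanal (~1 min^-1), already in the target unit.
steady_state_per_min = 1.0

print(f"burst phase:  {burst_per_min:.1f} min^-1")
print(f"slow range:   {slow_range_per_min[0]:.3f}-{slow_range_per_min[1]:.3f} min^-1")
print(f"steady state: {steady_state_per_min:.1f} min^-1")
```

On a common scale, the burst phase is roughly an order of magnitude faster than the reported steady-state value, which in turn is faster than the typical 3-5 turnovers per hour.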

The unknown ligand observed crystallographically copurifies 
with cADO, and is presumably derived from the host strain, 
the most likely candidate being a natural fatty acid (e.g., pal- 
mitic acid). However, the identity of this ligand has not been 
reported previously, so we independently solved the structure 
in-house, and identified the ligand using GC-MS analysis
(Figure S1). The bound ligand(s) extracted from cADO protein
samples were identified as a mixture of long-chain fatty acids
consisting of palmitic acid (molecular weight 256 Da), stearic acid
(molecular weight 284 Da) and oleic acid (molecular weight
282 Da); see Figure S1A.

Analysis of the fatty acid binding site highlighted two resi- 
dues (V41 and A134) adjacent to the C9 position of the ligand 
that might influence fatty acid binding. Both residues present 
their respective side chains towards the cavity of the fatty acid 
binding pocket. We therefore hypothesised that mutating 
these residues to tyrosine and phenylalanine, respectively, may 
introduce a steric block in this position and thus impede the 
binding of fatty acid chains beyond the length of C9. Following
site-directed mutagenesis, both V41Y and A134F variant pro- 
teins were overexpressed and purified to homogeneity. GC-MS 
analysis as performed for wild-type cADO (Figure S1B)
indicated that host-derived fatty acid ligands were not bound to the
purified variant proteins.

To determine how the V41Y and A134F variants were able to 
discriminate against the binding of the native fatty acid li- 
gands, we generated crystal structures of both V41Y and 
A134F in the presence of hexanoic acid. The structures were 
solved at 1.88 and 1.67 Å, respectively. A superimposition of
V41Y and A134F reveals they have retained the same global
architecture as wild-type cADO, with RMSDs of 0.262 and
0.177 Å, respectively, for all Cα atoms. Previous attempts to
crystallise wild-type cADO in the presence of shorter aldehydes 
had proved fruitless due to the presence of palmitic acid 
blocking the active site. However, clear electron density was
observed for hexanoic acid (Figure 1) within the variant
proteins, consistent with the lack of detectable palmitic acid in
the purified enzyme forms.
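
For readers unfamiliar with the RMSD values quoted above, the sketch below shows how a root-mean-square deviation is computed from two matched lists of Cα coordinates. It assumes the two structures have already been superimposed (it does not perform an optimal superposition), and the coordinates are hypothetical placeholders rather than values from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units, e.g. angstroms)
    between two equally long lists of (x, y, z) coordinates.

    Assumes the coordinate sets are already superimposed and that
    atom i in coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical C-alpha positions (angstroms), for illustration only.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
# The same positions displaced by 0.1 angstrom along x.
shifted = [(x + 0.1, y, z) for (x, y, z) in reference]

print(f"RMSD = {rmsd(reference, shifted):.3f} angstrom")
```

Values of a few tenths of an ångström, as reported for the two variants, correspond to backbones that are essentially unchanged.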

In both variant structures, V41Y and A134F, the mutated side
chains encroach upon the palmitic acid binding site near the
C11 position, effectively blocking the site and thus preventing
palmitic acid binding. Importantly, neither mutation causes any
occlusion of the active site channel that extends towards the
di-iron centre, thus allowing the variants to retain binding and
catalytic activity towards aldehydes shorter in length than
approximately C10 (Figure 2).

Figure 1. Ball and stick representation of hexanoic acid bound in the active
site of A134F. Electron density shown as a blue mesh (2Fo-Fc contoured at
1σ). Residues are shown as sticks with blue carbons, and iron ions are grey.

Figure 2. Cartoon representation of A) V41Y and B) A134F. Internal protein cavity of wild-type cADO shown as
a grey surface. Palmitic acid is shown as a ball and stick representation (grey). Native residues are shown as sticks
with yellow carbons, and iron ions are shown as orange spheres. Mutated residues Y41 (A) and F134 (B) are
shown as pink and red sticks, respectively.

Figure 3. A) Decarbonylation of butanal to propane using wild-type cADO in
the presence of a PMS/NADH chemical reducing system. Reactions
contained wild-type cADO (10 µM), ferrous ammonium sulfate (20 µM), PMS
(75 µM), NADH (1 mM) and C4 aldehyde (2 mM). Reactions were carried out
micro-aerobically, incubated at 37 °C at 220 rpm for up to 3 h. Peaks A and B
refer to propane and butanal, respectively. B) Michaelis-Menten plot of
wild-type cADO activity versus butanal concentration, ranging from 0 to 40 mM.
Inset: Wild-type cADO reaction kinetics at different enzyme concentrations.
Circles, triangles and squares represent 10, 20 and 40 µM of wild-type cADO,
respectively.

Figure 4. A) Substrate specificity of wild-type cADO and variants V41Y and
A134F was evaluated from initial steady-state velocity (V0) measurements
for different aldehyde chain lengths (C4-18). The wild-type cADO is indicated
in light grey, the V41Y variant in grey and A134F variant in dark grey.
B) Relative activity of wild-type cADO, V41Y and A134F variant for different alkyl
chain lengths (C3-17). Wild-type cADO is set at 100%. All reactions used a
PMS/NADH reducing system with 10 µM cADO (see the Experimental Section
for details).

To evaluate how the engineered changes in the V41Y and
A134F proteins altered the substrate selectivity of cADO, we
measured the enzyme activity with a range of different
aldehyde substrates. These experiments used phenazine
methosulfate/reduced nicotinamide adenine dinucleotide (PMS/NADH)
as the auxiliary reducing system. [17] The decarbonylation
reaction was first monitored by following the conversion of
butanal to propane by wild-type cADO enzyme using GC
(Figure 3A). Initial reaction velocities (V0) displayed apparent
Michaelis-Menten behaviour with respect to butanal
concentration (Figure 3B). Although butanal is decarbonylated to
propane, the k_cat (0.0031 ± 0.0001 min⁻¹) and K_m
(10.1 ± 0.9 mM) values indicate that wild-type cADO activity
with this substrate is very low. In control studies,
kinetic analyses performed with wild-type cADO and butanal
showed that product accumulated linearly with respect to
time and that reaction rates were, as expected, dependent on
protein concentration (Figure 3B, inset).
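
To make the reported butanal kinetics more concrete, the following minimal sketch evaluates the standard Michaelis-Menten rate law, v0 = k_cat[E][S]/(K_m + [S]), using the k_cat and K_m values quoted above and the 10 µM enzyme concentration used in the assays. The function name and the choice of substrate concentrations are illustrative assumptions, not part of the original analysis.

```python
def michaelis_menten_rate(substrate_mM, k_cat_per_min, k_m_mM, enzyme_uM):
    """Initial velocity (µM product per minute) from the Michaelis-Menten
    rate law: v0 = k_cat * [E] * [S] / (K_m + [S])."""
    if substrate_mM < 0:
        raise ValueError("substrate concentration cannot be negative")
    v_max = k_cat_per_min * enzyme_uM  # µM per minute at saturation
    return v_max * substrate_mM / (k_m_mM + substrate_mM)


# Values reported in the text for wild-type cADO with butanal.
K_CAT_PER_MIN = 0.0031
K_M_MM = 10.1
ENZYME_UM = 10.0

# Illustrative substrate concentrations within the 0-40 mM range of Figure 3B.
for butanal_mM in (2.0, 10.1, 40.0):
    v0 = michaelis_menten_rate(butanal_mM, K_CAT_PER_MIN, K_M_MM, ENZYME_UM)
    print(f"[butanal] = {butanal_mM:5.1f} mM -> v0 = {v0:.4f} µM/min")
```

At a substrate concentration equal to K_m, the predicted velocity is half of V_max, which is the defining property of K_m. At the 2 mM butanal used in the screening assays, the enzyme is therefore operating well below saturation.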

A range of aldehyde substrates from C4 to C18 was used to
determine the specificity of wild-type cADO and variants V41Y
and A134F. For each substrate, the apparent k_cat (k_app) value
was measured and relative activities determined (Figure 4). For
clarity, the horizontal axis in Figure 4A and B is broken to
highlight differences in substrate concentration used. C4-10
aldehydes were screened individually at 2 mM; due to limited
solubility, C11-18 aldehydes were screened at 300 µM.

Wild-type cADO has a broad substrate range, catalysing C4-18
aldehyde decarbonylation. With C4-10 aldehydes, wild-type
cADO exhibited maximal activity with the medium-chain
aldehyde, octanal (k_app value of 0.122 ± 0.002 min⁻¹). With
longer-chain aldehydes (C11-18), maximal activity was observed with
octadecanal (k_app value of 0.06 ± 0.001 min⁻¹). In contrast, the
V41Y variant cADO has maximal activity with decanal (k_app
value of 0.07 ± 0.001 min⁻¹), and the relative activity with C4-10
aldehydes remains similar to that for wild-type cADO.
Furthermore, V41Y shows an approximately sevenfold reduction in
relative activity towards octadecanal when compared to
wild-type cADO (k_app value of 0.01 ± 0.001 min⁻¹). The substrate
specificity range of the V41Y variant is therefore narrower than
that observed for wild-type cADO, with a preference toward
short-chain aldehydes.

The A134F variant shows a similar specificity to V41Y, being
almost inactive with the majority of long-chain aldehyde
substrates tested. For instance, A134F retains only 11% relative
activity for octadecanal (k_app value of 0.007 ± 0.001 min⁻¹). When
compared to wild-type cADO, however, A134F exhibits
enhanced general activity with shorter-chain aldehydes, with
maximal activity for hexanal (k_app value increases to
0.215 ± 0.0002 min⁻¹). In addition, the A134F variant displayed an
approximate fourfold increase in the rate of butanal consumption
(k_app value of 0.003 ± 0.0005 min⁻¹) and approximately sixfold
increase in pentanal consumption (k_app value of
0.023 ± 0.001 min⁻¹) compared to wild-type cADO. In an attempt to
improve further the selectivity of cADO towards short-chain
aldehydes, we combined the two beneficial aromatic residues
(V41Y and A134F) into a single cADO variant and tested this
new variant (V41Y/A134F) against C4-10 aldehydes (Table S3).
However, no synergistic or additive effects were observed in
reactions with short-chain aldehydes compared with reaction
profiles with the single variants.
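
The relative activities discussed above are ratios of variant to wild-type k_app values, with wild-type set at 100%. As a minimal sketch, the calculation can be reproduced for octadecanal from the rounded k_app values given in the text; because those values are rounded, the results are approximate, and the dictionary and function names are illustrative.

```python
def relative_activity(k_app_variant, k_app_wild_type):
    """Variant activity as a percentage of wild-type activity."""
    if k_app_wild_type <= 0:
        raise ValueError("wild-type k_app must be positive")
    return 100.0 * (k_app_variant / k_app_wild_type)


# k_app values (min^-1) with octadecanal, as quoted in the text.
OCTADECANAL_K_APP = {"wild-type": 0.06, "V41Y": 0.01, "A134F": 0.007}

relative = {
    name: relative_activity(k_app, OCTADECANAL_K_APP["wild-type"])
    for name, k_app in OCTADECANAL_K_APP.items()
}

for name, percent in relative.items():
    print(f"{name:>9}: {percent:5.1f}% of wild-type")
```

With these rounded inputs, A134F comes out at roughly 11-12% of wild-type activity with octadecanal, in line with the figure quoted above.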

Figure 5. All strains (E. coli control lacking cADO, E. coli transformed with
wild-type cADO or the A134F variant) were cultivated in lysogeny broth. All
reactions contained butanal (10 mM) and propane concentration was
quantified from whole-cell biotransformations (see the Supporting Information for
details).

The decarbonylation reactions reported here were initially
performed in vitro. To demonstrate that long-chain fatty acids
are efficiently excluded from the A134F variant both in vitro
and in vivo, whole-cell biotransformations were performed
(Figures 5 and S2). From control experiments, small levels of
propane were detected using an untransformed Escherichia
coli strain (0.03 ± 0.01 mg L⁻¹), the origin of which is unclear as
genome searches indicate that this organism does not contain
cADO-related genes. In broad agreement with in vitro turnover
data (Figure 4), the E. coli strain containing the A134F-variant
cADO generated propane to a level (0.46 ± 0.04 mg L⁻¹)
approximately twofold greater than E. coli containing wild-type cADO
(0.27 ± 0.04 mg L⁻¹). This elevation in activity is slightly less
than that determined from the in vitro turnover
measurements. We attribute this small difference to the approximately
twofold lower expression of the A134F variant in E. coli
compared with wild-type cADO (Figure S2).
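
The "approximately twofold" comparison above can be made explicit with a short calculation. The sketch below computes the ratio of the two propane concentrations and propagates the quoted uncertainties, under the assumption (not stated in the text) that the ± values are independent standard deviations; the function name is illustrative.

```python
import math


def fold_change(value, value_sd, reference, reference_sd):
    """Ratio value/reference with first-order error propagation,
    assuming the two uncertainties are independent."""
    if value <= 0 or reference <= 0:
        raise ValueError("both values must be positive")
    ratio = value / reference
    relative_sd = math.sqrt((value_sd / value) ** 2 + (reference_sd / reference) ** 2)
    return ratio, ratio * relative_sd


# Propane concentrations (mg per litre) quoted in the text.
ratio, ratio_sd = fold_change(0.46, 0.04, 0.27, 0.04)
print(f"A134F / wild-type = {ratio:.2f} ± {ratio_sd:.2f}")
```

This treatment ignores the small background signal from the untransformed control strain and the difference in expression level noted above, both of which would need to be considered in a fuller analysis.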

In summary, based on crystallographic data we have isolated 
two variant forms of cADO that were strategically engineered 
to display improved specificity for short- to medium-chain al- 
dehydes. The A134F cADO has also been shown to generate 
enhanced levels of propane production in whole-cell biotrans- 
formations compared to wild-type cADO. These studies define 
a region in the substrate channel that can be modified to 
exclude longer-chain aldehydes and improve reactivity with 
shorter-chain substrates. This simple switch in specificity releas- 
es cADO from potential complications arising from long-chain 
fatty acid binding in vivo. The A134F variant is an excellent cat- 
alytic module from which to now explore the role of second- 
shell residues to improve specificity and catalysis by cADO for 
production of drop-in hydrocarbon biofuels. 



Experimental Section 

Auxiliary PMS/NADH assay: All assays were performed under
microaerobic conditions in an anaerobic glovebox (Belle Technology)
under a nitrogen atmosphere (oxygen maintained at <2 ppm)
unless otherwise stated. Enzyme assays were performed in
potassium phosphate buffer (100 mM, pH 7.2) containing KCl (100 mM)
and 10% glycerol. Aldehyde substrates were made up as stock
solutions in DMSO. Enzymatic reactions contained cADO (10 µM),
ferrous ammonium sulfate (20 µM), PMS (75 µM), NADH (1 mM) and
either C4-10 (2 mM) or C11-18 (300 µM) aldehyde in a total volume of
0.5 mL.
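
As a worked example of the assay composition above, the following sketch converts each final concentration into the absolute amount present in a single 0.5 mL reaction, using the identity 1 µM = 1 nmol per mL. The dictionary keys and function name are illustrative labels, not terms from the protocol.

```python
REACTION_VOLUME_ML = 0.5

# Final concentrations in µM, taken from the assay description (1 mM = 1000 µM).
FINAL_CONCENTRATIONS_UM = {
    "cADO": 10.0,
    "ferrous ammonium sulfate": 20.0,
    "PMS": 75.0,
    "NADH": 1000.0,
    "C4-10 aldehyde": 2000.0,
    "C11-18 aldehyde": 300.0,
}


def nmol_in_reaction(concentration_uM, volume_mL):
    """Amount in nmol: 1 µM equals 1 nmol per mL."""
    return concentration_uM * volume_mL


amounts_nmol = {
    name: nmol_in_reaction(conc, REACTION_VOLUME_ML)
    for name, conc in FINAL_CONCENTRATIONS_UM.items()
}

for name, amount in amounts_nmol.items():
    print(f"{name:<26} {amount:8.1f} nmol")

# Molar excess of short/medium-chain substrate over enzyme.
substrate_to_enzyme = amounts_nmol["C4-10 aldehyde"] / amounts_nmol["cADO"]
```

Each reaction therefore contains 5 nmol of enzyme, with the C4-10 aldehydes present at a 200-fold molar excess.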

Gas chromatography detection of volatile hydrocarbon product
(C3-9 alkane) from enzyme reactions: Headspace analysis of
volatile products (C3-9 alkanes) was carried out using GC. The reaction
mixtures were shaken continuously (190 rpm) at 37 °C. Headspace
samples (1.0 mL) were manually drawn off, and injected into the
GC with a syringe-lock needle at time intervals (0, 36, 72, 108, 144,
180 min for propane and 0, 5, 10, 20, 40, 60 min for n>3 alkanes).
All kinetic assays were performed in duplicate.

A Varian 3800 GC equipped with a DB-WAX column (30 m ×
0.32 mm × 0.25 µm film thickness, J&W Scientific) was used to detect
and quantify the hydrocarbon released from enzyme reactions. The
column temperature was programmed as follows: 40 °C hold for
2 min, to 100 °C at 20 °C min⁻¹ (for detection of propane) and 40 °C
hold for 2 min, to 150 °C at 20 °C min⁻¹ (for detection of n>3
alkanes). The injector temperature was 250 °C (10:1 split), and the FID
temperature was set at 250 °C. The carrier gas was helium at a flow
rate of 1 mL min⁻¹. Peak identification of each alkane was achieved
by comparison with pure alkane standards. Quantification of
alkanes was achieved by comparison of integrated peak areas with
calibration curves of pure alkane standards.
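
Quantification against a calibration curve, as described above, amounts to fitting a line through the peak areas of known standards and inverting it for unknown samples. The sketch below illustrates this with an ordinary least-squares fit; the standard amounts and peak areas are hypothetical numbers chosen for illustration and are not data from this study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_peak_area(peak_area, slope, intercept):
    """Invert the calibration line to estimate the amount of alkane."""
    return (peak_area - intercept) / slope


# Hypothetical calibration standards: amount (arbitrary units) vs peak area.
standard_amounts = [0.0, 1.0, 2.0, 4.0]
standard_areas = [0.0, 50.0, 100.0, 200.0]

slope, intercept = fit_line(standard_amounts, standard_areas)
sample_amount = amount_from_peak_area(150.0, slope, intercept)
print(f"slope = {slope:.1f}, intercept = {intercept:.1f}, sample = {sample_amount:.2f}")
```

In practice a separate curve would be needed for each alkane, since detector response differs between compounds.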

Gas chromatography-mass spectrometry (GC-MS) detection of
less volatile hydrocarbon products (C10-17): Liquid-phase analysis
of nonvolatile products (C10-17 alkanes) was carried out using
GC-MS. At time intervals (0, 5, 10, 20, 40, 60, 80, 100 and 120 min),
the reaction (1 mL) was terminated by extraction with ethyl acetate
(900 µL), containing an internal standard (0.005% limonene), and
dried over MgSO4. A 1 µL sample was then analysed using GC-MS
on a Varian 3800 GC instrument equipped with a Saturn 2000 ion
trap MS and a CP-8400 autosampler. A DB-WAX column (30 m ×
0.25 mm × 0.25 µm film thickness, J&W Scientific) was used with the
following temperature programme: 70 °C hold for 2 min, to 250 °C
at 20 °C min⁻¹, hold for 2 min. The injector temperature was 250 °C
(10:1 split), and the carrier gas was helium at a flow rate of
1 mL min⁻¹. The transfer line, manifold and ion trap temperatures
were set to 250, 35 and 150 °C, respectively. All kinetic assays were
performed in duplicate.
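
The oven temperature programmes given in this section determine the minimum run time of each analysis. As a minimal sketch, the total duration of a programme can be computed by summing the hold times and the ramp times (temperature difference divided by ramp rate). The step encoding used below is an illustrative convention, and the result excludes cool-down and equilibration between injections.

```python
def programme_minutes(start_celsius, steps):
    """Total duration of an oven programme.

    steps is a list of ("hold", minutes) or
    ("ramp", target_celsius, rate_celsius_per_min) tuples.
    """
    total = 0.0
    temperature = start_celsius
    for step in steps:
        if step[0] == "hold":
            total += step[1]
        elif step[0] == "ramp":
            _, target, rate = step
            if rate <= 0:
                raise ValueError("ramp rate must be positive")
            total += abs(target - temperature) / rate
            temperature = target
        else:
            raise ValueError(f"unknown step type: {step[0]!r}")
    return total


# Programmes as described in the text.
PROPANE_GC = (40.0, [("hold", 2.0), ("ramp", 100.0, 20.0)])
LONGER_ALKANE_GC = (40.0, [("hold", 2.0), ("ramp", 150.0, 20.0)])
GC_MS = (70.0, [("hold", 2.0), ("ramp", 250.0, 20.0), ("hold", 2.0)])

propane_run = programme_minutes(*PROPANE_GC)
longer_run = programme_minutes(*LONGER_ALKANE_GC)
gcms_run = programme_minutes(*GC_MS)
print(propane_run, longer_run, gcms_run)
```

The resulting run times (5, 7.5 and 13 min) are short relative to the 36 min sampling interval used for propane, but the n>3 alkane time points at 0, 5 and 10 min are spaced more closely than a single 7.5 min run.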

Structural data have been deposited at RCSB Protein Data Bank 
(IDs: 4KVQ, 4KVR and 4KVS) 